Conversation
ddaspit
left a comment
There was a problem hiding this comment.
@ddaspit reviewed 17 files and all commit messages, and made 6 comments.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on Enkidu93 and pmachapman).
src/SIL.Machine/QualityEstimation/Usability/UsabilityBase.cs line 1 at r1 (raw file):
namespace SIL.Machine.QualityEstimation.Usability
I don't think we need a separate Usabability namespace. Generally, I try to avoid too deep of a namespace hierarchy. I would just put all of the classes in the QualityEstimation namespace.
src/SIL.Machine/QualityEstimation/Usability/UsabilityBase.cs line 3 at r1 (raw file):
namespace SIL.Machine.QualityEstimation.Usability { public abstract class UsabilityBase
I would make all of the usability classes immutable.
src/SIL.Machine/QualityEstimation/QualityEstimation.cs line 13 at r1 (raw file):
/// Provides chrF3 quality estimation support for pre-translations. /// </summary> public class QualityEstimation
I would rename this to Chrf3QualityEstimator.
src/SIL.Machine/QualityEstimation/QualityEstimation.cs line 84 at r1 (raw file):
/// </summary> /// <param name="confidences">The confidence values.</param> public void EstimateQuality(Dictionary<string, double> confidences)
I would refactor this method to return the usability results rather than store it in the object. I think a more functional design would work here. Also, I would use Tuple<MultiKeyRef, double> as the type for confidences.
src/SIL.Machine/QualityEstimation/QualityEstimation.cs line 94 at r1 (raw file):
/// </summary> /// <param name="confidences">The confidence values.</param> public void EstimateQuality(Dictionary<VerseRef, double> confidences)
I would refactor this method to return the usability results rather than store it in the object. Also, I would use Tuple<ScriptureRef, double> as the type for confidences.
src/SIL.Machine/QualityEstimation/Scores/Score.cs line 3 at r1 (raw file):
namespace SIL.Machine.QualityEstimation.Scores { public class Score
I would make all of the Score classes internal and move them to the QualityEstimation namespace.
Enkidu93
left a comment
There was a problem hiding this comment.
In general, you've ported the code from silnlp quite closely. I was expecting this PR to be a little smaller/flatter with a basic chrf3 projection function in an estimator class. Do you think more-or-less porting the code to Machine is the right move given that you will be interacting with these scores differently in SF - i.e., retrieving them per pretranslation per corpus rather than from files etc.? I think it's totally fine to move all the code to Machine, but I wondered if you think this will be re-used in SF as-is. Of course, you're the one who will be using it so it is your call! But I think you have freedom to refactor however it'll be most helpful in SF while the core is still available to use in silnlp.
@Enkidu93 reviewed 17 files and all commit messages, and made 2 comments.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on ddaspit and pmachapman).
src/SIL.Machine/QualityEstimation/QualityEstimation.cs line 13 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I would rename this to
Chrf3QualityEstimator.
Or maybe even ConfidenceChrf3...? Do you think we'll possibly have additional quality estimators down the road? I guess we can cross that bridge when we come to it.
pmachapman
left a comment
There was a problem hiding this comment.
@pmachapman made 6 comments.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on ddaspit).
src/SIL.Machine/QualityEstimation/QualityEstimation.cs line 13 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
Or maybe even
ConfidenceChrf3...? Do you think we'll possibly have additional quality estimators down the road? I guess we can cross that bridge when we come to it.
Done.
src/SIL.Machine/QualityEstimation/QualityEstimation.cs line 84 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I would refactor this method to return the usability results rather than store it in the object. I think a more functional design would work here. Also, I would use
Tuple<MultiKeyRef, double>as the type forconfidences.
Done.
src/SIL.Machine/QualityEstimation/QualityEstimation.cs line 94 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I would refactor this method to return the usability results rather than store it in the object. Also, I would use
Tuple<ScriptureRef, double>as the type forconfidences.
Done.
src/SIL.Machine/QualityEstimation/Scores/Score.cs line 3 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I would make all of the
Scoreclassesinternaland move them to theQualityEstimationnamespace.
Done.
src/SIL.Machine/QualityEstimation/Usability/UsabilityBase.cs line 1 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I don't think we need a separate
Usababilitynamespace. Generally, I try to avoid too deep of a namespace hierarchy. I would just put all of the classes in theQualityEstimationnamespace.
Done.
src/SIL.Machine/QualityEstimation/Usability/UsabilityBase.cs line 3 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I would make all of the usability classes immutable.
Done. I have made the class properties unable to be changed. It is a shame that .net standard 2.0 does not have the init modifier!
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #397 +/- ##
==========================================
+ Coverage 72.79% 73.00% +0.21%
==========================================
Files 424 439 +15
Lines 36309 36666 +357
Branches 5006 5037 +31
==========================================
+ Hits 26431 26768 +337
- Misses 8770 8782 +12
- Partials 1108 1116 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ddaspit
left a comment
There was a problem hiding this comment.
@ddaspit partially reviewed 22 files and all commit messages, made 15 comments, and resolved 6 discussions.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on mshannon-sil and pmachapman).
src/SIL.Machine/QualityEstimation/BookUsability.cs line 3 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { public class BookUsability : UsabilityBase
I would rename this to ScriptureBookUsability.
src/SIL.Machine/QualityEstimation/ChapterUsability.cs line 3 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { public class ChapterUsability : BookUsability
I would rename this to ScriptureChapterUsability.
src/SIL.Machine/QualityEstimation/ChrF3QualityEstimation.cs line 11 at r2 (raw file):
/// Provides chrF3 quality estimation support for pre-translations. /// </summary> public class ChrF3QualityEstimation
Generally, in Machine, we've named classes that perform an operation with agent nouns, i.e. Chrf3QualityEstimator.
src/SIL.Machine/QualityEstimation/ChrF3QualityEstimation.cs line 13 at r2 (raw file):
public class ChrF3QualityEstimation { private readonly BookScores _bookScores = new BookScores();
Can we pass the scores through the different method calls instead of storing them as field?
src/SIL.Machine/QualityEstimation/SequenceUsability.cs line 3 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { public class SequenceUsability : TxtFileUsability
I would rename this to TextSegmentUsability.
src/SIL.Machine/QualityEstimation/SequenceUsability.cs line 17 at r2 (raw file):
} public int SequenceNumber { get; }
This should store the ref.
src/SIL.Machine/QualityEstimation/TxtFileUsability.cs line 3 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { public class TxtFileUsability : UsabilityBase
I would rename this to TextUsability.
src/SIL.Machine/QualityEstimation/TxtFileUsability.cs line 11 at r2 (raw file):
} public string TargetDraftFile { get; }
I would rename this to TextId.
src/SIL.Machine/QualityEstimation/VerseUsability.cs line 3 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { public class VerseUsability : ChapterUsability
I would rename this to ScriptureSegmentUsability. This doesn't necessarily store the usability for a verse.
src/SIL.Machine/QualityEstimation/VerseUsability.cs line 18 at r2 (raw file):
} public string Verse { get; }
This should store the ScriptureRef.
src/SIL.Machine/QualityEstimation/BookScores.cs line 5 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { internal class BookScores
I would rename this to ScriptureBookScores.
src/SIL.Machine/QualityEstimation/ChapterScores.cs line 5 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { internal class ChapterScores
I would rename this to ScriptureChapterScores.
src/SIL.Machine/QualityEstimation/SequenceScore.cs line 3 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { internal class SequenceScore : Score
I would rename this to TextSegmentScore.
src/SIL.Machine/QualityEstimation/TxtFileScores.cs line 5 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { internal class TxtFileScores
I would rename this to TextScores.
src/SIL.Machine/QualityEstimation/VerseScore.cs line 5 at r2 (raw file):
namespace SIL.Machine.QualityEstimation { internal class VerseScore : Score
I would rename this to ScriptureSegmentScore.
Enkidu93
left a comment
There was a problem hiding this comment.
@Enkidu93 partially reviewed 22 files and all commit messages.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on mshannon-sil and pmachapman).
Fixes #387
I wasn't sure on what values to use in unit tests (as I haven't seen any example data), so chose numbers that would get what I want, not necessarily values that match reality.
Also, some variable and parameter naming is possibly inconsistent due to my misunderstanding the Python source.
This change is